Optimization  Strategies  for  Cognition  and  Autonomy 
in  Mixed  Human-Robot  Teams 


Vaibhav Srivastava and Francesco Bullo

Center for Control, Dynamical Systems & Computation
University of California at Santa Barbara
http://motion.me.ucsb.edu


AFOSR-DSI  Data-to-Decisions  &  Autonomy  Workshop 
RMIT  University,  Melbourne,  Australia,  July  9th  and  10th,  2012 


Vaibhav Srivastava & FB (UCSB)  Cognition and Autonomy Management  AFOSR-DSI Wrkshp 9jul12  1 /

Two  Critical  Issues 


Photo  courtesy:  The  Wall  Street  Journal 


Optimal  information  aggregation 


- Which source to observe?
- Efficient search and detection
- Routing for evidence collection

- Optimal time allocation?
- Optimal streaming rate?
- Optimal number of operators?


Cognition  &  Autonomy  Management  System  (CAMS) 

to  optimize  human-robot  team  objective 
Big Picture: Human-robot decision dynamics


Uncertain  environment  surveyed  by  human-UAV  team 
(Courtesy:  Prof.  Kristi  Morgansen) 


UAV  surveillance  (Courtesy:  http://www.modsim.org/) 


In New Military, Data Overload Can Be Deadly

By THOM SHANKER and MATT RICHTEL

When military investigators looked into an attack by American helicopters last February
that left 23 Afghan civilians dead, they found that the operator of a Predator drone had
failed to pass along crucial information about the makeup of a gathering crowd of villagers.

[...] mistake: information overload.

Data is among the most potent weapons of the 21st century. Unprecedented amounts of raw
information help the military determine what targets to hit and what to avoid. And
drone-based sensors have given rise to a new class of wired warriors who must filter the
information sea. But sometimes [...]

http://www.nytimes.com/2011/01/17/technology/17brain.html
Cognition and Autonomy Management System


- Information collection and aggregation by robotic network
- Information processing and decision making by human operator


- Based on tasks in queue and estimated cognitive state, CAMS
  specifies the time the operator should spend on each task

- Based on the operator’s decisions and world estimate, the CAMS
  collects information from the most pertinent source
General Human-Automaton Interaction Model

Outline


- Topic 1: Information Processing

Operator Cognition Models


Yerkes-Dodson effect

Yerkes-Dodson  1908 


Evolution  of  probability  of  detection 

Pew  '68 


- operator utilization ratio = linear dynamical system
  expected (unforced) service time = convex function of utilization
  Y-D curve well-established, e.g., validated by Savla et al. '10

- the evidence for decision making evolves as a drift-diffusion process
  the probability of the correct decision is a sigmoid function of time
Implications of Sigmoid Performance


Sigmoid  function  and  linear  penalty 


$$\max_{t \ge 0} \; f(t) - c\,t$$


Derivative  of  a  sigmoid  function 


Optimal allocation vs. penalty rate


-  Optimal  allocation  jumps  down  to  zero  at  critical  penalty  rate 

-  Jump  creates  combinatorial  effects 
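
The jump can be reproduced numerically. The sketch below is a minimal illustration, assuming a logistic sigmoid with hypothetical slope and inflection values (the slides do not give the functional form). The names `sigmoid` and `optimal_allocation` are illustrative, and the maximizer of f(t) - c t is found by a plain grid search rather than by any method stated in the talk.

```python
import math


def sigmoid(t, slope=1.0, inflection=5.0):
    """Logistic performance curve (assumed form; parameters are hypothetical)."""
    return 1.0 / (1.0 + math.exp(-slope * (t - inflection)))


def optimal_allocation(c, slope=1.0, inflection=5.0, t_max=20.0, steps=20000):
    """Grid-search maximizer of f(t) - c*t over t in [0, t_max]."""
    best_t = 0.0
    best_val = sigmoid(0.0, slope, inflection)
    for i in range(1, steps + 1):
        t = t_max * i / steps
        val = sigmoid(t, slope, inflection) - c * t
        if val > best_val:  # strict inequality keeps t = 0 on ties
            best_t, best_val = t, val
    return best_t


if __name__ == "__main__":
    # Sweep the penalty rate and watch the allocation drop to zero.
    for c in (0.01, 0.05, 0.10, 0.15, 0.30):
        print(f"penalty rate {c:.2f} -> allocation {optimal_allocation(c):.2f}")
```

With these assumed parameters the allocation is either zero or strictly beyond the inflection point; no penalty rate yields a small positive allocation, which is the discontinuity the slide refers to.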
Experimental Validation of Sigmoidal Performance in
Visual Perception


[Screenshot of the experiment interface: countdown timer ("7.4 seconds left"), instruction "Maximum 10 clicks. Find all differences.", and a counter of differences found]


- task = spot the differences

- expected # of detected differences
  is a linear function of time (DDM)

- probability of detecting more than 60% of the differences
  is sigmoid (threshold-based decision making)
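
One way to see how a linear expected count can coexist with a sigmoidal threshold probability: suppose, purely as an illustrative assumption, that differences are found as a Poisson process with rate `rate`. The expected number found by time t is then `rate * t`, while the probability of having found at least `k` of them is an Erlang CDF, which is S-shaped for k >= 2. The detection model, the rate, and the threshold `k` are hypothetical and are not taken from the experiment.

```python
import math


def expected_detections(t, rate):
    """Expected number of differences found by time t (linear in t)."""
    return rate * t


def prob_at_least(k, t, rate):
    """P(N(t) >= k) for a Poisson count N(t) with mean rate * t (assumed model)."""
    mean = rate * t
    below = sum(math.exp(-mean) * mean**j / math.factorial(j) for j in range(k))
    return 1.0 - below


if __name__ == "__main__":
    # Hypothetical setting: threshold of 6 detections, one detection per unit time.
    for t in range(0, 13, 2):
        print(t, expected_detections(t, 1.0), round(prob_at_least(6, float(t), 1.0), 3))
```

Under this assumption the threshold probability is convex before t = (k - 1) / rate and concave afterwards.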
Dynamic Queue with Penalty and Situational Awareness I


Operator 


- Tasks arrive as a Poisson process with rate $\lambda$

- Task $\gamma$ sampled from a distribution:
  reward $w_\gamma$, sigmoid params (inflection, slope), penalty rate $c_\gamma$

- State variables: queue length $n_\ell$ and utilization ratio $x_\ell$ at stage $\ell$

- Unforced service time = Y-D law $S_\gamma(x)$

- Decision variables: duration allocation $t_\ell$, rest time $r_\ell$, binary $z_\ell$




Dynamic  Queue  with  Penalty  and  Situational  Awareness  II 

Average  Reward 


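$$\max_{z_\ell t_\ell \,\ge\, z_\ell S(x_{\ell-1})} \;\; \lim_{L \to \infty} \frac{1}{L} \sum_{\ell=1}^{L} \Big( \mathbb{E}\big[w_{\gamma_\ell} f_{\gamma_\ell}(t_\ell)\big] - c\,\mathbb{E}[n_\ell]\,(t_\ell + r_\ell) - \cdots \Big)$$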


System  Dynamics 


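$$\mathbb{E}[n_{\ell+1}] = \mathbb{E}\big[\max\{1,\; n_\ell - 1 + \mathrm{Poisson}(\lambda z_\ell (t_\ell + r_\ell))\}\big],$$

$$x_{\ell+1} = \big(1 - e^{-t_\ell/\tau} + x_\ell\, e^{-t_\ell/\tau}\big)\, e^{-r_\ell/\tau} \in [x_{\min}, x_{\max}]$$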
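
The following is a minimal simulation sketch of one stage of such dynamics under stated assumptions: utilization relaxes toward 1 while the operator works and toward 0 while resting, with a hypothetical time constant `tau`, and the queue loses the served task and gains Poisson arrivals over the stage duration. The binary decision variable and the clipping to [x_min, x_max] are omitted for brevity. All function names and numbers are illustrative and not taken from the slides.

```python
import math
import random


def utilization_update(x, t, r, tau):
    """Work for time t (relax toward 1), then rest for time r (relax toward 0)."""
    after_work = 1.0 - math.exp(-t / tau) + x * math.exp(-t / tau)
    return after_work * math.exp(-r / tau)


def poisson_sample(mean, rng):
    """Knuth's multiplication method; adequate for small means."""
    limit = math.exp(-mean)
    count, product = 0, rng.random()
    while product > limit:
        count += 1
        product *= rng.random()
    return count


def queue_update(n, t, r, lam, rng):
    """Serve one task, add arrivals during t + r, keep at least one task."""
    return max(1, n - 1 + poisson_sample(lam * (t + r), rng))


def simulate(stages, t, r, lam, tau, x0=0.5, n0=1, seed=0):
    """Run a fixed (t, r) policy and record (queue length, utilization)."""
    rng = random.Random(seed)
    x, n, history = x0, n0, []
    for _ in range(stages):
        x = utilization_update(x, t, r, tau)
        n = queue_update(n, t, r, lam, rng)
        history.append((n, x))
    return history


if __name__ == "__main__":
    for n, x in simulate(5, t=1.0, r=0.5, lam=0.5, tau=5.0):
        print(n, round(x, 3))
```

A fixed allocation and rest time are used here only to exercise the updates; the talk's policy chooses them per stage.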


[Figure: plots versus task index (horizontal axis: Task)]
Certainty Equivalent Solution

Illustrative Example II


Reward  versus  Arrival  Rate 


Benefit  per  unit  task 


Benefit  rate 


Optimal  arrival  rate 

- Switching occurs when operator is expected to be always non-idle
- Designer may pick desired accuracy on each task to design arrival rate
Weight Queue Length Allocation


Illustrative  Example  I 


Optimal  Allocations  and  Rest  Time 
Outline


- Topic 2: Information Aggregation

Spatial Quickest Detection: Detection Delay


Expected detection delay at region k

$$\mathbb{E}[\mathrm{delay}_k(q)] = \frac{e^{-\eta} + \eta - 1}{q_k\, \mathcal{D}(f_k^1, f_k^0)}\, \big(q \cdot T + q \cdot D q\big)$$


Two  stage  quickest  detection  strategy 


- pick optimal $q^* = \operatorname{argmin}_q \sum_{k=1}^{N} \mathbb{E}[\mathrm{delay}_k(q)]$

- adapt $q^*$ with the evidence collected at each stage
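
A minimal sketch of the first stage, under stated assumptions: the delay at region k is read as (e^(-eta) + eta - 1) / (q_k * KL_k) times the mean cycle time q.T + q.Dq, where KL_k is a divergence between the anomalous and nominal observation models at region k. The two-region restriction, the grid search, and every number below are hypothetical choices made for illustration; the slides do not state how q* is computed.

```python
import math


def expected_delay(k, q, T, dist, kl, eta):
    """Expected detection delay at region k for selection probabilities q."""
    n = len(q)
    mean_processing = sum(q[i] * T[i] for i in range(n))
    mean_travel = sum(q[i] * dist[i][j] * q[j] for i in range(n) for j in range(n))
    samples_needed = (math.exp(-eta) + eta - 1.0) / (q[k] * kl[k])
    return samples_needed * (mean_processing + mean_travel)


def best_two_region_policy(T, dist, kl, eta, steps=999):
    """Grid search over q = (p, 1 - p) minimizing the summed delay."""
    best_q, best_cost = None, float("inf")
    for i in range(1, steps + 1):
        p = i / (steps + 1)
        q = (p, 1.0 - p)
        cost = sum(expected_delay(k, q, T, dist, kl, eta) for k in range(2))
        if cost < best_cost:
            best_q, best_cost = q, cost
    return best_q, best_cost


if __name__ == "__main__":
    T = [1.0, 1.0]                  # hypothetical processing times
    dist = [[0.0, 2.0], [2.0, 0.0]]  # hypothetical travel times
    print(best_two_region_policy(T, dist, kl=[0.1, 1.0], eta=5.0))
```

In this toy setting the region whose anomaly is harder to distinguish (smaller divergence) is visited more often.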


Region  Selection  Probability  Likelihood  of  Anomaly 




Spatial  Quickest  Detection 

Dynamic  Vehicle  Routing  for  Distributed  Surveillance 


- N regions, arbitrary # anomalies
- an ensemble of CUSUM algorithms

- $T_k$ = collection + transmission + processing time at region k

- $d_{ij}$ = distance between regions i and j
Outline


- Topic 3: Combined Information Aggregation and Processing


UCSB  Campus  Map 




Simultaneous Information Aggregation and Processing I


Critical Issue:

human decisions are not i.i.d.

=> no closed-form delay expression


Adaptive  Policy  with  Human  Feedback 


1. determine $q^*$ and sample regions

2. set operator performance at region k:
   $f_k(t) = \pi_k f_k^1(t) + (1 - \pi_k) f_k^0(t)$

3. determine optimal allocation and rest time

4. update CUSUM statistic using operator's decision

5. go to step 1.
CUSUM Stats


Spatial  Quickest  Detection  with  Human  Input 


- human operator allocates time t to a task
  and decides on presence/absence of anomaly

- decision is a Bernoulli random variable with

$$P(\text{success} \mid t) = \begin{cases} f^1(t), & \text{if an anomaly is present,} \\ f^0(t), & \text{if no anomaly is present.} \end{cases}$$


Spatial  Quickest  Detection 


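- at stage $\ell$, pick a region k from stationary distribution q
- go to region k and collect evidence $y_\ell$ and decision $\mathrm{dec}_\ell \in \{0, 1\}$

- update CUSUM statistic for region k:

$$\Lambda^k_\ell = \Big(\Lambda^k_{\ell-1} + \log \frac{P(\mathrm{dec}_\ell \mid t_\ell, \text{anomaly})}{P(\mathrm{dec}_\ell \mid t_\ell, \text{no anomaly})}\Big)^{+}$$

- declare an anomaly at region k if $\Lambda^k_\ell > \eta$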
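
The CUSUM recursion with binary operator decisions can be sketched as follows. This is a minimal illustration, assuming the operator's decision is a Bernoulli observation whose probability of reporting 1 is `p_anomaly` when an anomaly is present and `p_normal` when it is not; both values, the threshold, and the function names are hypothetical. In the talk these probabilities would depend on the time allocated to the task.

```python
import math


def cusum_update(stat, decision, p_anomaly, p_normal):
    """One CUSUM step from a binary operator decision.

    p_anomaly / p_normal: assumed probabilities that the operator reports 1
    when an anomaly is / is not present, for the time allocated to the task.
    """
    if decision == 1:
        llr = math.log(p_anomaly / p_normal)
    else:
        llr = math.log((1.0 - p_anomaly) / (1.0 - p_normal))
    # Positive part: the statistic never drops below zero.
    return max(0.0, stat + llr)


def first_alarm(decisions, p_anomaly, p_normal, threshold):
    """Index of the first stage at which the statistic exceeds the threshold."""
    stat = 0.0
    for stage, decision in enumerate(decisions):
        stat = cusum_update(stat, decision, p_anomaly, p_normal)
        if stat > threshold:
            return stage
    return None


if __name__ == "__main__":
    print(first_alarm([1, 0, 1, 1, 1], p_anomaly=0.8, p_normal=0.2, threshold=4.0))
```

One such statistic would be kept per region and updated only when that region is visited.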
Simultaneous Information Aggregation and Processing II


Outline 


- Conclusions
3rd IFAC Workshop on Distributed Estimation and
Control  in  Networked  Systems 


NecSys'12,  September  14-15,  2012,  Fess  Parker's  Doubletree  Resort,  Santa  Barbara,  California 


Relevant  Dates  and  Proceedings 

■ Submissions to NecSys'12 are open as of March 25. Please read the Information for
Authors.

■  Extended  Papers  submission  deadline:  April  30,  2012 

■  Notice  of  acceptance:  June  14,  2012 

■  Final  version  due:  July  15,  2012 

■  Early  registration  deadline:  July  15,  2012 

■  Hotel  registration  deadline:  August  13,  2012 
Conclusions & Future Directions


Conclusions 

- disciplines: human cognitive performance models, dynamic vehicle
  routing, decision making, dynamic optimization

- simultaneous information aggregation and processing architecture

- incorporation of cognitive / situational awareness / autonomy

- an adaptive strategy that collects evidence from regions with high
  likelihood of anomalies and optimally processes it


Future  Directions 

- experimental validation of models [ongoing] and of architecture
- incorporation of fatigue, learning and other cognitive models
- re-queuing of tasks, preemptive queues and more general scenarios
- dynamic anomalies and more complex detection tasks
- multi-vehicle, multi-operator, single-operator multitasking,
  heterogeneous scenarios